*******************************************************************************************************************************                                                                         
* Description: Stata code to fit random-intercept (RI) and random-intercept-and-random-slope (RI + RS) logistic mixed-effects
*			   models assuming either:
*									 (i) normally distributed random effects, or 
*									 (ii) an unspecified random-effects distribution (NPMLE), estimated using the Gateaux derivative in GLLAMM
* 
*			   All models are fitted using the GLLAMM procedure in Stata
*    
* Application: HILDA dataset with time as (wave - 1)/10  and  Age at Wave 1 as (Age at Baseline - 30)/10                                                                                       
*                                                                                                                                                                                                
* Programmer : L. Marquart                                                                            
*********************************************************************************************************************************

** Data Access: The Household, Income and Labour Dynamics in Australia (HILDA) dataset used for the analyses can be requested for free 
**				 from the Melbourne Institute (https://melbourneinstitute.unimelb.edu.au/hilda/for-data-users/ordering-hilda-survey-data). 
**				 The first 11 waves of data for women aged between 30-44 at the first wave were used for this analysis (See Section 4.1 
**				 of the paper for more details).


** Load the HILDA dataset, restricted to subjects with complete data and subjects with monotone missingness.
use "Motivatingexample.dta"

  ************************************************************************
  ** Step 1: Derive HILDA variables to be used in analyses:
  ************************************************************************
  * Explanatory variables:
  *    i) Marital Status 3 categories: Married or de facto (ref), Separated/Divorced or Widowed, Single 
  *    ii) Baseline Education 3 categories: Bachelor or higher (ref), Certificates and Diplomas or Year 12,  Year 11 or less
  *    iii) Dependent Children: No dependent Children (ref), Youngest dependent child<5, Youngest dependent child 5-24


*******************************************
***Creating key explanatory variables:
 ** (A) Marital status with 3 categories
   ** 0 = Married/ De facto
   ** 1 = Separated/Divorced/Widowed
   ** 2 = Never Married/ Single
  generate maritalStat3cat_analysis=.
  replace maritalStat3cat_analysis=0 if mrcurr==1|mrcurr==2 
  replace maritalStat3cat_analysis=1 if mrcurr==3|mrcurr==4|mrcurr==5
  replace maritalStat3cat_analysis=2 if mrcurr==6 
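
  ** Optional sanity check (illustrative; not required for the models below): cross-tabulate the
  ** source variable against the derived one to confirm that each mrcurr code maps to the intended
  ** category. The missing option also shows any mrcurr values left unclassified.
  tabulate mrcurr maritalStat3cat_analysis, missing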

 ** (B) Dependent Children
   ** 0 = No dependent children
   ** 1 = Youngest dependent child < 5 years old
   ** 2 = Youngest dependent child >=5 years old
  gen Children_combined3cat=.
  replace Children_combined3cat=0 if hhd0_4==0 & hhd5_9==0 & hhd1014==0 & hhd1524==0
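  * Stata treats missing values as larger than any number, so exclude them explicitly
  replace Children_combined3cat=1 if hhd0_4>=1 & !missing(hhd0_4)
  replace Children_combined3cat=2 if hhd0_4==0 & ((hhd5_9>0 & !missing(hhd5_9)) | (hhd1014>0 & !missing(hhd1014)) | (hhd1524>0 & !missing(hhd1524)))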
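
  ** Optional sanity check (illustrative): list the category frequencies, including any observations
  ** that fell into none of the three categories and therefore remain missing.
  tabulate Children_combined3cat, missing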
  
  
 ** (C) Education at Baseline (3 categories)
   ** 0 = Bachelor degree or higher
   ** 1 = TAFE or Year 12
   ** 2 = Year 11 or less
  
 ** Highest education level received (4 categories): 0 = Bachelor or higher (bachelor, grad dip or postgrad),
 **									1 = Certificates or diplomas, 2 = Year 12, 3 = Year 11 or below
  generate ed4cat=.
  replace ed4cat=0 if edhigh>=1 & edhigh<=3
  replace ed4cat=1 if edhigh>=4 & edhigh<=7
  replace ed4cat=2 if edhigh==8 
  replace ed4cat=3 if edhigh==9 
  

 ssc install carryforward
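** Highest education as at baseline (wave 1), carried forward to later waves.
** Sort by wave within person so that the wave 1 value is the one carried forward.
 gen education_baseline=ed4cat if wave==1
 bysort xwaveid (wave): carryforward education_baseline, gen(educbaseline)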
 
  gen Education_Baseline3cat=.
  replace Education_Baseline3cat=0 if  educbaseline==0
  replace Education_Baseline3cat=1 if educbaseline==1|educbaseline==2
  replace Education_Baseline3cat=2 if educbaseline==3
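
  ** Optional sanity check (illustrative): at wave 1 the 4-category variable should collapse into the
  ** 3-category baseline variable as 0 -> 0, 1 and 2 -> 1, 3 -> 2.
  tabulate ed4cat Education_Baseline3cat if wave==1, missing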
 
 

** Install GLLAMM 
 ssc install gllamm
 
  
 ** Generate time and age at baseline variables 
 gen wave_minus1Per10=(wave-1)/10
 gen ageBaselineper10=(AgeBaseline-30)/10
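
 ** Optional check of the rescaling (illustrative). Assuming waves 1-11 and baseline ages 30-44 in
 ** whole years, as described in the header, the expected ranges are:
 **    wave_minus1Per10:  (1-1)/10 = 0    to  (11-1)/10 = 1
 **    ageBaselineper10:  (30-30)/10 = 0  to  (44-30)/10 = 1.4
 summarize wave_minus1Per10 ageBaselineper10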
 
****************************************************************
***************************************************
* ANALYSIS 1: 
* RANDOM INTERCEPTS LOGISTIC MIXED MODEL
***************************************************
****************************************************************


 ***************************************************************************************
 ** A) Fit model assuming random effects are normally distributed:
 ***************************************************************************************
  xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat  if hgsex==2 , ///
	  i(xwaveid) family(binom) link(logit) ip(g) nip(20) adapt
 
	 
 ***************************************************************************************
 ** B) NPMLE of random intercept logistic model estimated using Gateaux derivative 
 ***************************************************************************************


 **		Start with a single mass point - 

     xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , ///
		 i(xwaveid) family(binom) link(logit) nip(1) ip(f)
	
		**Store the estimates from 1 mass point
	matrix a=e(b)
    local ll=e(ll)
	local k=e(k)
	
	**Assess whether 2 mass points is feasible - based on the fit from the model with 1 mass points as starting values 
	**											(use the gateaux option, with grid +/- 5 SD and 30 grid points)
	 xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) ip(f) nip(2) lf0(`k' `ll') gateaux(-16.79 16.79 30) from(a)	

	**Store the estimates from 2 mass points
	matrix a=e(b)
    local ll=e(ll)
	local k=e(k)
	
		**Assess whether 3 mass points is feasible - based on the fit from the model with 2 mass points 
		**											(use the gateaux option, with grid +/- 5 SD and 30 grid points)
	 xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) ip(f) nip(3) lf0(`k' `ll') gateaux(-16.79 16.79 30) from(a)	

	**Store the estimates from 3 mass points
	matrix a=e(b)
    local ll=e(ll)
	local k=e(k)
	
		**Assess whether 4 mass points is feasible - based on the fit from the model with 3 mass points 
		**											(use the gateaux option, with grid +/- 5 SD and 30 grid points)
	 xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat ,  i(xwaveid) family(binom) link(logit) ip(f) nip(4) lf0(`k' `ll') gateaux(-16.79 16.79 30) from(a)	

	
	**Store the estimates from 4 mass points
	matrix a=e(b)
    local ll=e(ll)
	local k=e(k)
	
		**Assess whether 5 mass points is feasible - based on the fit from the model with 4 mass points 
		**											(use the gateaux option, with grid +/- 5 SD and 30 grid points)
	 xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) ip(f) nip(5) lf0(`k' `ll') gateaux(-16.79 16.79 30) from(a)	
	 


	**Store the estimates from 5 mass points
	matrix a=e(b)
    local ll=e(ll)
	local k=e(k)
	
		**Assess whether 6 mass points is feasible - based on the fit from the model with 5 mass points 
		**											(use the gateaux option, with grid +/- 5 SD and 30 grid points)
	 xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) ip(f) nip(6) lf0(`k' `ll') gateaux(-16.79 16.79 30) from(a)	
	 
		**Store the estimates from 6 mass points
	matrix a=e(b)
    local ll=e(ll)
	local k=e(k)
	
		**Assess whether 7 mass points is feasible - based on the fit from the model with 6 mass points
		**											(use the gateaux option, with grid +/- 5 SD and 30 grid points)
	 xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) ip(f) nip(7) lf0(`k' `ll') gateaux(-16.79 16.79 30) from(a)	
	 
	
			**Store the estimates from 7 mass points
	matrix a=e(b)
    local ll=e(ll)
	local k=e(k)
	
		**Assess whether 8 mass points is feasible - based on the fit from the model with 7 mass points  
		**											(use the gateaux option, with grid +/- 5 SD and 30 grid points)
	 xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) ip(f) nip(8) lf0(`k' `ll') gateaux(-16.79 16.79 30) from(a)	
	 
		** The Gateaux derivative is negative, which suggests that an additional (8th) mass point is not required. Therefore, store the estimates for the model with 7 mass points 
	 estimates store MAR_NPMLE_K7_RI
	 
	 **Save EB posterior mean estimates
	 gllapred RI_Gataeaux_predu, u
	 
	 **Save posterior probabilities - these are stored in variables RI_Gataeaux_u1 - RI_Gataeaux_u7
	 gllapred RI_Gataeaux_u, p
	 




****************************************************************
***************************************************************
** ANALYSIS 2: 
** RANDOM INTERCEPTS AND RANDOM SLOPES LOGISTIC MIXED MODEL****
****************************************************************
****************************************************************

** Define the equations for the random effects: b0 (random intercept) and b1 (random slope on time)
 gen cons = 1

eq b0: cons
eq b1: wave_minus1Per10



 **********************************************************************************************************
 ** A) Fit model assuming random effects are distributed as a bivariate normal
 **********************************************************************************************************
 xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat  , i(xwaveid) family(binom) link(logit) nrf(2) eqs(b0 b1) ip(g) nip(20) adapt
	
	estimates store MAR_bivariateNorm_RIRS
	gllapred eb_Normal_RIRS, u
	 

	 
	 
	 
 **********************************************************************************************************
 ** B) NPMLE of random intercept and random slope logistic mixed model estimated using Gateaux derivative 
 **********************************************************************************************************
 
 
 **	Start with a single mass point - 

     xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat  , i(xwaveid) family(binom) link(logit) nrf(2) eqs(b0 b1) nip(1) ip(f)

**Save the model estimates from model with 1 mass point**.
	matrix a=e(b)
	local ll=e(ll)
	local k=e(k)


  **Assess whether 2 mass points is appropriate using the estimates from the 1 mass point model as starting values 
		**											(use the gateaux option, with boundary of the grid approximately
		**											+/- 5 scaled random effects with 60 grid points in each dimension)
  xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) nrf(2) eqs(b0 b1) nip(2) ip(f) from(a) gateaux(-30 30 60) lf0(`k' `ll')

  *Save the model estimates from model with 2 mass points**.
 matrix a=e(b)
 local ll=e(ll)
 local k=e(k)


  **Assess whether 3 mass points is appropriate using the estimates from the 2 mass point model as starting values	
		**											(use the gateaux option, with boundary of the grid approximately
		**											+/- 5 scaled random effects with 60 grid points in each dimension)
  xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) nrf(2) eqs(b0 b1) nip(3) ip(f) from(a) gateaux(-30 30 60) lf0(`k' `ll')
  
   *Save the model estimates from model with 3 mass points**.
 matrix a=e(b)
 local ll=e(ll)
 local k=e(k)


  **Assess whether 4 mass points is appropriate using the estimates from the 3 mass point model as starting values  
		**											(use the gateaux option, with boundary of the grid approximately
		**											+/- 5 scaled random effects with 60 grid points in each dimension)
 
  xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) nrf(2) eqs(b0 b1) nip(4) ip(f) from(a) gateaux(-30 30 60) lf0(`k' `ll')
  
   *Save the model estimates from model with 4 mass points**.
 matrix a=e(b)
 local ll=e(ll)
 local k=e(k)


  **Assess whether 5 mass points is appropriate using the estimates from the 4 mass point model as starting values   
		**											(use the gateaux option, with boundary of the grid approximately
		**											+/- 5 scaled random effects with 60 grid points in each dimension)
 
  xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) nrf(2) eqs(b0 b1) nip(5) ip(f) from(a) gateaux(-30 30 60) lf0(`k' `ll')
  
   *Save the model estimates from model with 5 mass points**.
 matrix a=e(b)
 local ll=e(ll)
 local k=e(k)


  **Assess whether 6 mass points is appropriate using the estimates from the 5 mass point model as starting values 
		**											(use the gateaux option, with boundary of the grid approximately
		**											+/- 5 scaled random effects with 60 grid points in each dimension)
 
  xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) nrf(2) eqs(b0 b1) nip(6) ip(f) from(a) gateaux(-30 30 60) lf0(`k' `ll')
  
  
   *Save the model estimates from model with 6 mass points**.
 matrix a=e(b)
 local ll=e(ll)
 local k=e(k)


  **Assess whether 7 mass points is appropriate using the estimates from the  6 mass point model as starting values 
		**											(use the gateaux option, with boundary of the grid approximately
		**											+/- 5 scaled random effects with 60 grid points in each dimension)
 
  xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) nrf(2) eqs(b0 b1) nip(7) ip(f) from(a) gateaux(-30 30 60) lf0(`k' `ll')
  
   *Save the model estimates from model with 7 mass points**.
 matrix a=e(b)
 local ll=e(ll)
 local k=e(k)


  **Assess whether 8 mass points is appropriate using the estimates from the 7 mass point model as starting values 
		**											(use the gateaux option, with boundary of the grid approximately
		**											+/- 5 scaled random effects with 60 grid points in each dimension)
 
  xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) nrf(2) eqs(b0 b1) nip(8) ip(f) from(a) gateaux(-30 30 60) lf0(`k' `ll')
  
   *Save the model estimates from model with 8 mass points**.
 matrix a=e(b)
 local ll=e(ll)
 local k=e(k)


  **Assess whether 9 mass points is appropriate using the estimates from the 8 mass point model as starting values 
		**											(use the gateaux option, with boundary of the grid approximately
		**											+/- 5 scaled random effects with 60 grid points in each dimension)
 
  xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) nrf(2) eqs(b0 b1) nip(9) ip(f) from(a) gateaux(-30 30 60) lf0(`k' `ll')
	 
	 
	  *Save the model estimates from model with 9 mass points**.
 matrix a=e(b)
 local ll=e(ll)
 local k=e(k)

	 
	 	 estimates store MAR_NPMLE_K9_RIRS
	 
	 **Save EB posterior mean estimates
	 gllapred RIRS_Gataeaux_predu, u
	 

	
  **Assess whether 10 mass points is appropriate using the estimates from the 9 mass point model as starting values
  		**											(use the gateaux option, with boundary of the grid approximately
		**											+/- 5 scaled random effects with 60 grid points in each dimension)
  xi: gllamm employment_2cat wave_minus1Per10 ageBaselineper10 i.maritalStat3cat_analysis i.Education_Baseline3cat i.Children_combined3cat , i(xwaveid) family(binom) link(logit) nrf(2) eqs(b0 b1) nip(10) ip(f) from(a) gateaux(-30 30 60) lf0(`k' `ll')
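			**The Gateaux derivative is negative, which suggests that a 10th mass point is not required; the 9 mass point model (MAR_NPMLE_K9_RIRS) is retained.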

  
  
	 

